AITopics | behavior mode

This paper presents a logic programming-based framework for policy-aware autonomous agents that can reason about potential penalties for non-compliance and act accordingly. While prior work has primarily focused on ensuring compliance, our approach considers scenarios where deviating from policies may be necessary to achieve high-stakes goals. Additionally, modeling non-compliant behavior can assist policymakers by simulating realistic human decision-making. Our framework extends Gelfond and Lobo's Authorization and Obligation Policy Language (AOPL) to incorporate penalties and integrates Answer Set Programming (ASP) for reasoning. Compared to previous approaches, our method ensures well-formed policies, accounts for policy priorities, and enhances explainability by explicitly identifying rule violations and their consequences. Building on the work of Harders and Inclezan, we introduce penalty-based reasoning to distinguish between non-compliant plans, prioritizing those with minimal repercussions. To support this, we develop an automated translation from the extended AOPL into ASP and refine ASP-based planning algorithms to account for incurred penalties. Experiments in two domains demonstrate that our framework generates higher-quality plans that avoid harmful actions while, in some cases, also improving computational efficiency. These findings underscore its potential for enhancing autonomous decision-making and informing policy refinement.
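The idea of ranking non-compliant plans by their repercussions can be illustrated with a small sketch. This is not the authors' AOPL-to-ASP translation or planner; the `Plan` class, the rule names, and the penalty values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """A candidate plan plus the policy rules it violates (hypothetical structure)."""
    actions: tuple
    violated_rules: tuple = ()


def total_penalty(plan, penalties):
    """Sum the penalty attached to every rule the plan violates."""
    return sum(penalties[rule] for rule in plan.violated_rules)


def select_plan(plans, penalties):
    """Prefer the lowest total penalty; break ties with the shorter plan."""
    return min(plans, key=lambda p: (total_penalty(p, penalties), len(p.actions)))


# Made-up penalties per policy rule (assumption, not from the paper).
penalties = {"speed_limit": 2, "red_light": 10}

plans = [
    Plan(actions=("drive_fast", "run_red_light"),
         violated_rules=("speed_limit", "red_light")),
    Plan(actions=("drive_fast", "stop", "go"),
         violated_rules=("speed_limit",)),
]

best = select_plan(plans, penalties)
# Listing the violated rules with their penalties gives a simple explanation.
explanation = {rule: penalties[rule] for rule in best.violated_rules}
print(best.actions, explanation)
```

Here the longer plan is selected because it incurs a smaller total penalty, and the `explanation` dictionary names the rule that was violated and what it cost.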

artificial intelligence, logic & formal reasoning, penalty, (16 more...)

arXiv.org Artificial Intelligence

2512.03931

Country:

North America > United States > Texas > Dallas County > Dallas (0.04)
North America > United States > Texas > Lubbock County > Lubbock (0.04)
North America > United States > Ohio > Butler County > Oxford (0.04)
(2 more...)

Genre: Research Report > New Finding (0.48)

Industry:

Transportation > Infrastructure & Services (0.68)
Transportation > Ground > Road (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Supplementary A Properties of the InfoGAIL

Neural Information Processing Systems, Oct-9-2025, 06:06:12 GMT

$I(x; y; c)$ can be decomposed as $I(x; y; c) = I(y; x) + I(c; x) - I(y, c; x) = I(y; x) + I(c; x) - H(y, c) + H(y, c \mid x) = I(y; c) - I(y; c \mid x)$. $I(s, a; s, a)$ is finally increased as well. The main parameters for training Ess-InfoGAIL are listed in Table 4. To minimize computational time, we restrict the update of the latent skill distribution to only the first iteration of policy updates. Our experiments demonstrate that this approach does not result in significant performance degradation.
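As a check on the decomposition of $I(x; y; c)$, each mutual-information term can be expanded into entropies using only the standard identity $I(a; b) = H(a) - H(a \mid b)$ and its conditional form. The worked derivation below is a sketch added for clarity and is not taken from the supplementary material itself.

```latex
\begin{align*}
I(x; y; c) &= I(y; x) + I(c; x) - I(y, c; x) \\
&= \bigl[H(y) - H(y \mid x)\bigr] + \bigl[H(c) - H(c \mid x)\bigr]
   - \bigl[H(y, c) - H(y, c \mid x)\bigr] \\
&= \bigl[H(y) + H(c) - H(y, c)\bigr]
   - \bigl[H(y \mid x) + H(c \mid x) - H(y, c \mid x)\bigr] \\
&= I(y; c) - I(y; c \mid x).
\end{align*}
```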

artificial intelligence, degree, machine learning, (12 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

bcf26768143c94bd36e363cd4bf5daf0-Paper-Conference.pdf

Neural Information Processing Systems, Oct-9-2025, 06:06:09 GMT

demonstration, machine learning, reinforcement learning, (13 more...)

Neural Information Processing Systems

Country:

Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
(4 more...)

Categorical Policies: Multimodal Policy Learning and Exploration in Continuous Control

Islam, SM Mazharul, Huber, Manfred

arXiv.org Artificial Intelligence, Aug-20-2025

A policy in deep reinforcement learning (RL), either deterministic or stochastic, is commonly parameterized as a Gaussian distribution alone, limiting the learned behavior to be unimodal. However, many practical decision-making problems favor a multimodal policy that facilitates robust exploration of the environment and thus helps address learning challenges arising from sparse rewards, complex dynamics, or the need for strategic adaptation to varying contexts. This issue is exacerbated in continuous control domains, where exploration usually takes place in the vicinity of the predicted optimal action, either through additive Gaussian noise or the sampling process of a stochastic policy. In this paper, we introduce Categorical Policies to model multimodal behavior modes with an intermediate categorical distribution, and then generate an output action conditioned on the sampled mode. We explore two sampling schemes that ensure differentiable discrete latent structure while maintaining efficient gradient-based optimization. By utilizing a latent categorical distribution to select the behavior mode, our approach naturally expresses multimodality while remaining fully differentiable via the sampling tricks. We evaluate our multimodal policy on a set of DeepMind Control Suite environments, demonstrating that through better exploration, our learned policies converge faster and outperform standard Gaussian policies. Our results indicate that the Categorical distribution serves as a powerful tool for structured exploration and multimodal behavior representation in continuous control.
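To make the "sample a mode, then condition the action on it" idea concrete, here is a minimal sketch. The abstract does not name its two sampling schemes, so the Gumbel-Softmax relaxation used below is an assumption, as are all function names, the logits, and the per-mode action means. Plain Python has no automatic differentiation, so the sketch shows only the forward sampling step.

```python
import math
import random


def gumbel_softmax_sample(logits, temperature, rng):
    """Draw a relaxed one-hot sample over behavior modes (Gumbel-Softmax)."""
    noisy = []
    for logit in logits:
        # Clamp u away from 0 and 1 so both logarithms are well defined.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))  # Gumbel(0, 1) noise
        noisy.append((logit + gumbel) / temperature)
    # Numerically stable softmax over the perturbed logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def to_one_hot(weights):
    """Discretize a relaxed sample to the single most likely mode."""
    k = max(range(len(weights)), key=weights.__getitem__)
    return [1.0 if i == k else 0.0 for i in range(len(weights))]


def mode_conditioned_action(weights, mode_means, noise_std, rng):
    """Generate a 1-D action around the mean of the selected mode."""
    mean = sum(w * mu for w, mu in zip(weights, mode_means))
    return mean + rng.gauss(0.0, noise_std)


rng = random.Random(0)
mode_logits = [2.0, 0.5, -1.0]   # hypothetical scores for three behavior modes
mode_means = [-1.0, 0.0, 1.0]    # hypothetical action mean for each mode

soft = gumbel_softmax_sample(mode_logits, temperature=0.5, rng=rng)
hard = to_one_hot(soft)
action = mode_conditioned_action(hard, mode_means, noise_std=0.1, rng=rng)
print(soft, hard, action)
```

In an autodiff framework, the relaxed sample `soft` is what would carry gradients back to the mode logits. Lower temperatures push `soft` closer to a one-hot vector.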

arxiv preprint arxiv, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

2508.13922

Country: North America > United States > Texas > Tarrant County > Arlington (0.04)

Genre: Research Report > New Finding (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

SwitchVLA: Execution-Aware Task Switching for Vision-Language-Action Models

Li, Meng, Zhao, Zhen, Che, Zhengping, Liao, Fei, Wu, Kun, Xu, Zhiyuan, Ren, Pei, Jin, Zhao, Liu, Ning, Tang, Jian

arXiv.org Artificial Intelligence, Jun-5-2025

Robots deployed in dynamic environments must not only follow diverse language instructions but also adapt flexibly when user intent changes mid-execution. While recent Vision-Language-Action (VLA) models have advanced multi-task learning and instruction following, they typically assume static task intent, failing to respond when new instructions arrive during ongoing execution. This limitation hinders natural and robust interaction in dynamic settings, such as retail or household environments, where real-time intent changes are common. We propose SwitchVLA, a unified, execution-aware framework that enables smooth and reactive task switching without external planners or additional switch-specific data. We model task switching as a behavior modulation problem conditioned on execution state and instruction context. Expert demonstrations are segmented into temporally grounded contact phases, allowing the policy to infer task progress and adjust its behavior accordingly. A multi-behavior conditional policy is then trained to generate flexible action chunks under varying behavior modes through conditioned trajectory modeling. Experiments in both simulation and real-world robotic manipulation demonstrate that SwitchVLA enables robust instruction adherence, fluid task switching, and strong generalization, outperforming prior VLA baselines in both task success rate and interaction naturalness.

artificial intelligence, arxiv preprint arxiv, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2506.03574

Country:

Europe > Netherlands > South Holland > Delft (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Learning Diverse Natural Behaviors for Enhancing the Agility of Quadrupedal Robots

Fu, Huiqiao, Dong, Haoyu, Xu, Wentao, Zhou, Zhehao, Deng, Guizhou, Tang, Kaiqiang, Dong, Daoyi, Chen, Chunlin

arXiv.org Artificial Intelligence, May-16-2025

Achieving animal-like agility is a longstanding goal in quadrupedal robotics. While recent studies have successfully demonstrated imitation of specific behaviors, enabling robots to replicate a broader range of natural behaviors in real-world environments remains an open challenge. Here we propose an integrated controller comprising a Basic Behavior Controller (BBC) and a Task-Specific Controller (TSC) which can effectively learn diverse natural quadrupedal behaviors in an enhanced simulator and efficiently transfer them to the real world. Specifically, the BBC is trained using a novel semi-supervised generative adversarial imitation learning algorithm to extract diverse behavioral styles from raw motion capture data of real dogs, enabling smooth behavior transitions by adjusting discrete and continuous latent variable inputs. The TSC, trained via privileged learning with depth images as input, coordinates the BBC to efficiently perform various tasks. Additionally, we employ evolutionary adversarial simulator identification to optimize the simulator, aligning it closely with reality. After training, the robot exhibits diverse natural behaviors, successfully completing the quadrupedal agility challenge at an average speed of 1.1 m/s and achieving a peak speed of 3.2 m/s during hurdling. This work represents a substantial step toward animal-like agility in quadrupedal robots, opening avenues for their deployment in increasingly complex real-world environments.

artificial intelligence, machine learning, robot, (16 more...)

arXiv.org Artificial Intelligence

2505.09979

Country:

Asia > China > Jiangsu Province > Nanjing (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine (0.67)
Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Architecture for Simulating Behavior Mode Changes in Norm-Aware Autonomous Agents

Glaze, Sean, Inclezan, Daniela

arXiv.org Artificial Intelligence, Feb-13-2025

This paper presents an architecture for simulating the actions of a norm-aware intelligent agent whose behavior with respect to norm compliance is set, and can later be changed, by a human controller. Updating an agent's behavior mode from a norm-abiding to a riskier one may be relevant when the agent is involved in time-sensitive rescue operations, for example. We base our work on the Authorization and Obligation Policy Language (AOPL) designed by Gelfond and Lobo for the specification of norms. We introduce an architecture and a prototype software system that can be used to simulate an agent's plans under different behavior modes that can later be changed by the controller. We envision such software to be useful to policy makers, as they can more readily understand how agents may act in certain situations based on the agents' attitudes towards norm compliance. Policy makers may then refine their policies if simulations show unwanted consequences.
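The effect of switching behavior modes can be sketched in a few lines. This is not the paper's architecture or its AOPL encoding; the two mode names, the plan representation, and the example actions are hypothetical stand-ins chosen to mirror the rescue scenario mentioned in the abstract.

```python
from enum import Enum


class BehaviorMode(Enum):
    """Two illustrative modes; a real system may define more."""
    NORM_ABIDING = "norm_abiding"
    RISKY = "risky"


def admissible_plans(plans, mode):
    """Filter plans by mode. Each plan is (actions, number_of_violations)."""
    if mode is BehaviorMode.NORM_ABIDING:
        return [plan for plan in plans if plan[1] == 0]
    return list(plans)  # the risky mode tolerates norm violations


def simulate(plans, mode):
    """Return the shortest admissible plan, or None if there is none."""
    candidates = admissible_plans(plans, mode)
    if not candidates:
        return None
    return min(candidates, key=lambda plan: len(plan[0]))


plans = [
    (("wait", "take_detour", "arrive"), 0),      # compliant but slower
    (("cross_restricted_area", "arrive"), 1),    # faster, violates one norm
]

# The controller first runs the agent in a norm-abiding mode, then switches.
cautious_plan = simulate(plans, BehaviorMode.NORM_ABIDING)
urgent_plan = simulate(plans, BehaviorMode.RISKY)
print(cautious_plan, urgent_plan)
```

Comparing the two results side by side is the kind of output a policy maker could inspect to see how a change in attitude toward compliance alters the agent's plan.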

agent, artificial intelligence, behavior mode, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.4204/EPTCS.416.7

2502.09215

Country:

North America > United States > Ohio (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.40)

Industry: Government (0.54)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

From Foresight to Forethought: VLM-In-the-Loop Policy Steering via Latent Alignment

Wu, Yilin, Tian, Ran, Swamy, Gokul, Bajcsy, Andrea

arXiv.org Artificial Intelligence, Feb-10-2025

While generative robot policies have demonstrated significant potential in learning complex, multimodal behaviors from demonstrations, they still exhibit diverse failures at deployment time. Policy steering offers an elegant way to reduce the chance of failure by using an external verifier to select from low-level actions proposed by an imperfect generative policy. Here, one might hope to use a Vision Language Model (VLM) as a verifier, leveraging its open-world reasoning capabilities. However, off-the-shelf VLMs struggle to understand the consequences of low-level robot actions, as these are represented fundamentally differently from the text and images the VLM was trained on. In response, we propose FOREWARN, a novel framework to unlock the potential of VLMs as open-vocabulary verifiers for runtime policy steering. Our key idea is to decouple the VLM's burden of predicting action outcomes (foresight) from evaluation (forethought). For foresight, we leverage a latent world model to imagine future latent states given diverse low-level action plans. For forethought, we align the VLM with these predicted latent states to reason about the consequences of actions in its native representation (natural language) and effectively filter proposed plans. We validate our framework across diverse robotic manipulation tasks, demonstrating its ability to bridge representational gaps and provide robust, generalizable policy steering. Videos can be found on the project website: https://yilin-wu98.github.io/forewarn/.
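The split between predicting outcomes and evaluating them can be sketched as a generic selection loop. This is not FOREWARN's implementation: the latent world model and the VLM verifier are replaced by toy callables, and every name and number below is hypothetical.

```python
def steer(candidates, predict_outcome, score_outcome):
    """Pick the candidate plan whose predicted outcome the verifier scores highest."""
    if not candidates:
        raise ValueError("at least one candidate plan is required")
    return max(candidates, key=lambda plan: score_outcome(predict_outcome(plan)))


def toy_world_model(plan):
    """Foresight stand-in: the outcome of a plan of 1-D moves is the final position."""
    return sum(plan)


def toy_verifier(final_position, goal=1.0):
    """Forethought stand-in: outcomes closer to the goal get a higher score."""
    return -abs(final_position - goal)


# Candidate plans as they might be proposed by an imperfect generative policy.
candidate_plans = [
    [0.2, 0.2],   # stops short of the goal
    [0.5, 0.4],   # ends near the goal
    [1.0, 0.8],   # overshoots the goal
]

chosen = steer(candidate_plans, toy_world_model, toy_verifier)
print(chosen)
```

Keeping `predict_outcome` and `score_outcome` as separate arguments mirrors the decoupling described in the abstract: either component can be swapped out without touching the selection logic.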

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2502.01828

Country:

Europe > Netherlands > South Holland > Delft (0.04)
Europe > Monaco (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
(2 more...)

Autonomous Agents and Policy Compliance: A Framework for Reasoning About Penalties
